I love Free Software, and the people behind it. Even though I have been a free software user and developer for more than a decade now, there's still more to be amazed at, still more to contribute, and still more to learn. I love the Free Software community, because it never ceases to amaze me. It's not just about the technical side, but the social, human side of it, too!

During the past year, I came to another milestone in my career: as of July first, I am a Free Software developer, and enjoying every moment of it. This is a feat that could not have been achieved without there being demand for what I do, without the free software community as a whole. For that, I owe you all a big thank you: you make it possible for me to make my dream come true. I can only hope that through my work, and through my example, I can contribute back enough.

To this day, I take delight in contributing to free software, be it Lisp or Python (or projects built in them), any of my own projects, syslog-ng, Google Summer of Code mentoring, or Debian work: any and all of that is great fun. Even processing NEW is (I really should do that more often)! It's an endless source of not only entertainment and knowledge, but inspiration too.

Whenever I read something as amazing as Russ Allbery's mail on the Debian technical committee impasse, I realize just what amazing people we have in our community, how much we can all learn from them, and how much we owe them. This is yet another reason why I feel that contributing to Free Software is a passion many can share, because it teaches us good values too, on top of being technically excellent.

I love Free Software, because without it, I would be lost. Thank you.
$ git clone https://github.com/balabit/syslog-ng-incubator.git
$ cd syslog-ng-incubator
$ autoreconf -i
$ ./configure && make && sudo make install
If anything happens to be installed in a non-standard location, one will need to adjust PKG_CONFIG_PATH to help the configure script locate the needed libraries.

Once installed (the configure script will figure out where syslog-ng modules are, and the modules will be put there), syslog-ng will automatically recognise the new modules. One can make sure that this is the case by running the following command after installation:
$ syslog-ng --module-registry
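As mentioned above, if the syslog-ng development libraries live under a custom prefix, pointing pkg-config at them might look something like this (the prefix below is an assumption; adjust it to wherever syslog-ng is installed):

```
$ PKG_CONFIG_PATH=/opt/syslog-ng/lib/pkgconfig ./configure
```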
The Plugins

Now that we're over the hard part of compiling and installing the Incubator, let's see what is inside! I will start with the easier things, and move on to the more complicated features as we progress. That is, we'll start with some template functions, have a glance at the trigger source, then explore the rss destination, and finish off with the riemann destination.

Template functions

The Incubator gives us three new template functions, some less useful than others, and one that's a huge, ugly hack for a problem that I ended up solving in a very different way, without the hack.

The first of these is the $(or) template function, which takes any number of arguments, and returns the first one that is not empty. The main use case here is that if you have, say, similarly named fields, but some messages have one, the others another, and you want to normalize them, $(or) is one way to do it:
$(or ${HOST} ${HOSTNAME} ${HOST_NAME} "<unknown>")
Another function is $(//), which does the same thing as the built-in $(/) template function: divide its arguments. Except this one works with floating-point numbers, while the built-in one is for integers exclusively. Using it is simple, too:
$(// ${some_number} 3.4)
The last template function provided by the Incubator is $(state), which can be used to maintain global state that does not depend on log messages. You can set values in here, like counters, from within a template function. It is possible to count the total amount of downloaded data when processing an HTTP server log, for example. But it's slow, and there are better ways to do the same thing; syslog-ng really isn't the best tool for this kind of job. If anyone happens to find a use case for it, please let me know. As for using it, it has two modes: set (with two arguments) and get (with one):
$(state some-variable ${VALUE})
$(state some-variable)
Trigger source

The trigger source has much in common with the built-in mark feature: at given intervals, it sends a message. This is mostly a debugging aid, for when you want to generate messages without an external tool. It only has two options: trigger-freq() and trigger-message(), which default to 10 and "Trigger source is trigger happy.", respectively. It also accepts a number of common source options, such as program-override(), host-override() and tags().

To use it, one just needs to set it up like any other source, and bind it to a destination with a log statement:
source s_trigger {
  trigger(
    program-override("trigger")
    tags("trigger-happy")
    trigger-freq(5)
    trigger-message("Beep.")
  );
};
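To actually see the generated messages, the source can be bound to a destination; a minimal sketch (the destination name and file path below are assumptions):

```
destination d_trigger_log {
  file("/var/log/trigger.log");
};

log {
  source(s_trigger);
  destination(d_trigger_log);
};
```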
Without a program-override() option, messages will be attributed to syslog-ng, which is likely not what you want, even while debugging: internal messages are usually routed somewhere else.

RSS destination

The RSS destination is an interesting beast. It offers an Atom feed of the last hundred messages routed to the destination. I could very well imagine this being useful in a situation where one already has monitoring set up to listen on various RSS sources: this would be just another. It also works well with most RSS feed readers. The length of the feed is not configurable at this time, and the number of options is limited to port(), feed-title(), entry-title() and entry-description().

The first one specifies which port the destination should listen on (it serves one client at a time!); feed-title() can be used to set the title of the feed itself, while entry-title() and entry-description() can be used with templates to fill in the per-message Atom entries.

Once we have a suitable path we want to route to the RSS destination (such as critical error messages only), we can set it up like this:
destination d_rss {
  rss(
    port(8192)
    feed-title("Critical errors in the system")
    entry-title("Error from ${PROGRAM} @ ${HOST_FROM} at ${ISODATE}")
    entry-description("${MESSAGE}")
  );
};
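A matching path could look something like this (the source name and the exact severity range are assumptions; adjust them to your setup):

```
filter f_critical {
  level(crit..emerg);
};

log {
  source(s_local);
  filter(f_critical);
  destination(d_rss);
};
```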
Riemann destination

Being the original motivation for the Incubator, I left this one for last. This module is the interface between your logs and the Riemann monitoring system. With it, you can take all the legacy applications that are hard to monitor but provide log files, use syslog-ng's extraordinary log processing power, and send clear and concise events over to Riemann.

One can use it to monitor logins, downloads, uploads, exceptions: pretty much anything. Just extract some metric or state, send it over to Riemann, and it will do the heavy lifting. What exactly can be done with it will be worth a separate blog post, so for now, I will give just a very tiny example:
destination d_riemann {
  riemann(
    ttl("120")
    description("syslog-ng internal errors")
    metric(int("${SEQNUM}"))
  );
};
Hook it up to a path that collects syslog-ng's internal messages, keeps only the error messages, and routes them toward this destination:
log {
  source { internal(); };
  filter { level(err..emerg); };
  destination(d_riemann);
};
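For illustration, here is a sketch of a destination exercising a few more of the options described below (the server name, the service, and the values are made up):

```
destination d_riemann_logins {
  riemann(
    server("riemann.example.com")
    port(5555)
    type("tcp")
    host("${HOST}")
    service("logins")
    state("ok")
    ttl("300")
    metric(int("1"))
  );
};
```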
The destination itself has the following options:
server(): The server to connect to; defaults to localhost.
port(): The port the Riemann server is listening on; defaults to 5555.
type(): The type of the connection: UDP or TCP; defaults to TCP.
host(): The host field of the Riemann event; defaults to ${HOST}.
service(): The service field of the Riemann event; defaults to ${PROGRAM}.
state(): The state field of the Riemann event; no default.
description(): The description of the event; no default.
ttl(): The time-to-live of the event; no default.
metric(): The metric to send to Riemann. This needs to be either an integer or a floating-point number. Using a type hint is advised here; without one, the destination will try to parse the value of this option as a float.
tags(): As the name implies, this adds tags to the Riemann event. By default, all the tags set on the message are forwarded.
attributes(): With this option, one can set custom attributes on the Riemann event. The syntax is the same as for value-pairs(), with a few enhancements. What the difference is, is left as an exercise for the reader: the example config that comes with the Incubator has a hint.

Wildcard file sources

The premium edition of syslog-ng supports wildcards in file sources, such as /var/log/apache/*.log
, and it will automatically notice new files being created or old ones disappearing. This is a very useful feature which has been missing from the open source edition; it is being worked on, but is far from complete, and will not make it into syslog-ng 3.5. However, at least on GNU/Linux, there is a way to implement something similar, albeit in a fairly crude manner. Yet, it works surprisingly well in most cases. You only need to abuse a few things...

The trick itself is pretty easy, and is based on the same underlying kernel feature that syslog-ng PE uses under the hood: inotify(7). What we will do is use incron to monitor a directory, and create syslog-ng config file snippets whenever a file creation event occurs. We also set up syslog-ng to include these snippets, and reload the configuration after each event received. This way, we immediately notice newly created logfiles without the need to poll the directory, and with config snippets created on the fly, we do not modify the main syslog-ng configuration file, either.

To do this, we first need to create an appropriate syslog-ng configuration:
@version: 3.5

destination d_all_logs { file("/var/log/all.log"); };

include "/etc/syslog-ng/conf.d/notify-*.conf";

The config itself is a dummy: it has no sources and no log paths, just a destination, which we will use in the snippets. Next, we create the script that will create these snippets for us! It takes a single command-line argument: the filename of the newly created logfile.
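The script itself can be sketched along these lines (the snippet layout, the follow-freq() setting, and the CONF_DIR default are assumptions; for real use, point CONF_DIR at /etc/syslog-ng/conf.d to match the include above):

```shell
#!/bin/sh
# Sketch: generate a syslog-ng config snippet for a newly created log
# file, then ask syslog-ng to reload. incron passes the new file's path
# as the first argument.
file="${1:-/var/log/remote.d/example.log}"
conf_dir="${CONF_DIR:-/tmp/syslog-ng-conf.d}"
mkdir -p "$conf_dir"

# Derive a safe identifier from the file name (non-alphanumerics become
# underscores).
name=$(basename "$file" | tr -c 'a-zA-Z0-9\n' '_')

# The snippet reads the new file and routes it to the d_all_logs
# destination defined in the main configuration.
cat > "$conf_dir/notify-$name.conf" <<EOF
source s_$name { file("$file" follow-freq(1)); };
log { source(s_$name); destination(d_all_logs); };
EOF

# Reload the running syslog-ng, if its control tool is available.
command -v syslog-ng-ctl >/dev/null 2>&1 && syslog-ng-ctl reload || true
```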
The script generates a config snippet that routes the new file to the d_all_logs destination defined in the main configuration file, and reloads the configuration afterwards.

Now, all we need is to tell incron to call this script. Assuming we saved it as /usr/local/sbin/syslog-ng-wildcard-notify, this is how the incrontab entry would look:
/var/log/remote.d/*.log IN_CREATE,IN_MOVED_TO /usr/local/sbin/syslog-ng-wildcard-notify $@/$#

And that's about it! Now whenever a new file gets dropped into the directory, syslog-ng will get notified. There is a downside, however: when a file gets deleted, the configuration snippet is not removed. That part is not so simple: we want syslog-ng to finish reading the file before reloading the config, which makes it a bit trickier. But if we don't care about that, we could just add an IN_DELETE event, and write another script that looks up which snippet is responsible for the file, deletes it, and reloads. Doing that is left as an exercise for the reader!
Multi-line messages

syslog-ng 3.5 gains support for multi-line messages, available for file() and pipe() sources only.

Indented multi-line

The easiest variant is indented multi-line, where each line can be followed by others, indented by whitespace, and the message continues until the first non-indented line. This is the format used by the Linux kernel too, from version 3.5 on, for /dev/log. This type of multi-line can be used as follows:
source s_multiline {
  file("/path/to/file" multi-line-mode(indented));
};

Consider that /path/to/file has the following content:
First line
  Continuation 1;
  Continuation 2;
Second line

With the indented multi-line-mode() setting, this would turn into two log messages:
First line\n  Continuation 1;\n  Continuation 2;
Second line

Regexp-based multi-line

If multi-line input is not based on indentation, one can use the regexp multi-line-mode() instead, which makes two new settings available: multi-line-prefix() and multi-line-garbage(). These can be used to define the start and the end of a log message: anything between a line matching the prefix and a match of the garbage pattern will be considered a single message. That is, the prefix will be included in the message, but the garbage will not be: it is discarded.

To illustrate:
source s_multiline {
  file("/path/to/file"
       multi-line-mode(regexp)
       multi-line-prefix("^prefix")
       multi-line-garbage(" garbage$"));
};

If the source contains these lines:
prefix message
continuing garbage

This will turn into the following log message:
prefix message\ncontinuing

New destinations

It would be pretty hard to do a syslog-ng release without new destinations, and with 3.5, we will have three of them!

STOMP

While we had AMQP before, we now have a STOMP destination too, so we can stream logs to any STOMP server, with a few simple lines:
stomp(
  host("localhost")
  port(61613)
  destination("/topic/syslog")
  routing_key("")
  persistent(yes)
  ack(no)
  username("someone")
  password("something")
  body("This is the body! YAY! Here's a message: $MESSAGE!\n")
  value-pairs(scope(nv-pairs, syslog-proto, selected-macros))
)

Of course, none of these settings need to be set; you can just use the defaults, and it will just work!

Riemann

Did you know that syslog-ng is far more than a log collector and processor? No? You do now. When you have access to a lot of logs, and a tremendous amount of power in parsing them, you can easily use these tools for monitoring! And when we're talking monitoring, Riemann is a great asset in our toolbox, and with the new destination, we can easily forward metrics to it:
riemann(
  server("localhost")
  port(5555)
  host("$HOST")
  service("$PROGRAM")
  description("$PROGRAM pids")
  metric(int("$PID"))
  ttl(300)
);

Of course, the above is a very silly example; it would make much more sense to extract data from the log message instead, but it will serve as an illustration.

New template functions
The release also brings a handful of new template functions:

$(upper-case TEXT...) and $(lower-case TEXT...) convert their arguments to upper or lower case, respectively.

$(delimit DELIMITERS NEW-DELIMITER TEXT) replaces any of the characters in DELIMITERS with NEW-DELIMITER; for example, to turn tabs and spaces in the message into single spaces:

template("$(delimit \"\t \" \" \" $MESSAGE)\n")

$(env VARIABLE...) looks up the given variables in syslog-ng's environment.
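As a quick illustration of the last one, a template embedding an environment variable might look like this (the template name and the variable are assumptions):

```
template t_tagged {
  template("[$(env HOSTNAME)] ${MESSAGE}\n");
};
```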
Type hinting

syslog-ng 3.5 also introduces type hinting, supported by the mongodb() destination and the $(format-json) template function for now. When no type hint is specified, syslog-ng defaults to string.

To add type hints, simply wrap the respective template with the hinted type, like this:
mongodb(
  value-pairs(
    pair("date", datetime("$UNIXDATE"))
    pair("pid", int64("$PID"))
  )
);

Currently, the following type hints exist: boolean (anything that begins with a t or 1 is true, anything that begins with an f or 0 is false, everything else is an error), string, literal (same as string, but not quoted if it would be quoted otherwise), int32 (int is an alias for this), int64, and datetime. Only UNIX timestamps can be type-hinted to datetime; anything else will likely result in a casting error.

It is also possible to control what happens when type casting fails: syslog-ng can drop the whole message, drop the property, or fall back to string. It can also do any of these silently:
options { typecast(on-error(silently-drop-property)); };

Using this feature with
$(format-json)
is very similar:
$(format-json date=datetime("$UNIXDATE") pid=int64("$PID"))

With this feature in place, you can now store your non-string values with their proper types!

Unit suffixes

Unit suffixes make it considerably easier to set limits and describe numbers within the syslog-ng configuration. We no longer need to spell out sizes down to the last byte; it is now enough to write log-fifo-size(200MiB). syslog-ng will understand suffixes for kilo-, mega- and gigabytes (K, M and G, respectively), either in base-10 or base-2 (with an extra i after the suffix). One can also omit the trailing b from the end.

So, to set log-fifo-size() to 2097152 bytes, one can simply use 2MiB. Or, to set it to 2000000 bytes, 2Mb. That's a whole lot easier, isn't it? No more counting zeros, no more silly typos in a ten-digit number, no more pain, just easily readable units!

Miscellaneous features

Apart from the features above, there have been a lot of other changes and improvements in the code base:
Some options were added (like the username() and password() settings of the mongodb() driver), and some were renamed and deprecated: the replace() key transformation function of value-pairs() was renamed to replace-prefix(), as that makes the intent clearer. In this latter case, the old name is still valid, but obsoleted.
There is also a new filter, in-list(), with which one can implement efficient white- or blacklists. To use it, you will need a file with one value per line, and do something along these lines:
filter f_whitelist { in-list("/path/to/file.list", value("PROGRAM")); };

To do a blacklist, just negate the filter:
filter f_blacklist { not in-list("/path/to/file.list", value("HOST")); };
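The list file itself is plain text, one value per line; for a program whitelist it might contain, for example:

```
sshd
sudo
cron
```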
On Linux, syslog-ng now detects /dev/kmsg, and will use that instead of /proc/kmsg when a sufficiently recent kernel is detected (assuming one is using the system() source). The new kernel log format supports structured messages, and syslog-ng is smart enough to parse them, and make them accessible like all other message properties (with a .linux. prefix).
Parsing JSON

syslog-ng 3.4 gained a JSON parser, which turns a JSON payload into regular syslog-ng message properties. Nested structures are flattened, with their keys joined by dots, so a value such as {"deeper": {"level": "value"}} can be referenced as ${deeper.level} within a syslog-ng template.

The parsed keys can be prefixed using the prefix() option of the json-parser. One can also change what the parser will receive as input by using the template() option. See the documentation for more information on these.

Transporting raw JSON

The simplest scenario is to take a transport like TCP, simply push raw JSON structures through it, one on each line, and have syslog-ng parse them into its internal representation. One can do all kinds of things with the data then: transport it further as JSON, store it in files, or in a database; the possibilities are endless, but we will explore two scenarios.

JSON on TCP, using syslog field names

The first one is receiving JSON on TCP, using syslog field names, and storing the parsed result in a file (using the conventional format) and in MongoDB.
source s_tcp_json {
  tcp(port(10514) flags(no-parse));
};

parser p_json {
  json-parser();
};

destination d_file {
  file("/var/log/remote.log");
};

destination d_mongodb {
  mongodb(collection("remote_log"));
};

log {
  source(s_tcp_json);
  parser(p_json);
  destination(d_file);
  destination(d_mongodb);
};

This needs a bit of an explanation: we use the no-parse flag on the source, because by default, the tcp() source expects a message that conforms to the old syslog protocol. We have raw JSON, and we need to tell the source that. The other elements should be self-explanatory, I believe, with the exception of json-parser(): we give it no parameters, so it will not modify the keys in the JSON it receives, keeping them intact, as-is.

For this reason, incoming messages need to have the same keys as syslog-ng would normally use internally: DATE, HOST, MESSAGE, and so on.
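An incoming line in this first scenario might look something like this (the values are made up, purely for illustration):

```
{"DATE": "Jul 22 13:45:54", "HOST": "localhost", "PROGRAM": "prg00000", "PID": "1234", "MESSAGE": "this is a test message"}
```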
JSON on TCP, using custom field names

If the incoming JSON uses its own field names instead, we can parse it under a prefix, and map the fields ourselves:

source s_tcp_json {
  tcp(port(10514) flags(no-parse));
};

parser p_json {
  json-parser(prefix(".json."));
};

destination d_file {
  file("/var/log/remote.log"
       template("${.json.timestamp} ${.json.source} ${.json.app}[${.json.id}]: ${.json.msg}\n"));
};

destination d_mongodb {
  mongodb(collection("remote_log")
          value-pairs(
            pair("PROGRAM" "${.json.app}")
            pair("HOST" "${.json.source}")
            pair("PID" "${.json.id}")
            pair("MESSAGE" "${.json.msg}")
            pair("DATE" "${.json.timestamp}")
            pair("PRIORITY" "${.json.prio}")
            pair("FACILITY" "auth")
          ));
};

log {
  source(s_tcp_json);
  parser(p_json);
  destination(d_file);
  destination(d_mongodb);
};

We use the prefix() option of json-parser() to avoid collisions with the default syslog-ng namespace. Then, we use a template for the file destination, and value-pairs() for the MongoDB one. Doing all this, we get similar results as we did in the first scenario.

JSON over syslog

An even likelier scenario is when we have a JSON payload in the message part of a normal syslog message. This is useful because the logs themselves remain syslog-compatible, and can be transported through devices that know no better.

Thankfully, if we do not tell syslog-ng that we do not want any parsing on the input, the json-parser will receive the MESSAGE part of the log, so if we have a JSON payload there, things will just work:
source s_tcp_json_payload {
  tcp();
};

parser p_json {
  json-parser();
};

destination d_file {
  file("/var/log/remote.log");
};

destination d_mongodb {
  mongodb(collection("remote_log"));
};

log {
  source(s_tcp_json_payload);
  parser(p_json);
  destination(d_file);
  destination(d_mongodb);
};

Again, this assumes that the JSON payload uses field names compatible with syslog-ng's defaults. Adapting it to use templates or value-pairs() can of course be done, as shown in the example above. As an illustration, here's how a syslog message with a JSON payload would look:
<38>2013-07-22T13:45:54 localhost prg00000[1234]: {"MESSAGE": "foo", "PROGRAM": "bar"}

This would, of course, override the MESSAGE and PROGRAM fields, but the rest would be kept intact.

Mixed JSON and syslog stream

In order to parse JSON that is intermingled with normal syslog streams, we can use a new feature of syslog-ng 3.4: junctions. These allow us to create branches, and tell syslog-ng something along these lines:
If there's an incoming message that begins with @json:, parse it as JSON; otherwise, treat it as a normal syslog message.
We can, of course, skip the part that mandates an in-stream marker, and things will still just work; but in that case, an error will get logged for each JSON message, which is undesirable. Therefore, if one wants to intermingle JSON and syslog traffic on the same wire, it's best to do so with the JSON having a marker prefix. The marker, however, is not going to be part of the string the JSON parser receives, so setting it to an opening curly bracket will not work, unless the template is modified accordingly.

The config to do this looks like this:
block parser mixed-json-parser() {
  channel {
    junction {
      channel {
        parser { json-parser(marker("@json:")); };
        rewrite { set-tag(".json"); };
        flags(final);
      };
      channel {
        flags(final);
      };
    };
  };
};

source s_mixed {
  tcp();
  parser { mixed-json-parser(); };
};

A more elaborate example can be found in a post by Bazsi.

Closing thoughts

That covers most of the use cases I have seen for the JSON parser. If there are cases uncovered, or questions unanswered, I can be reached via Twitter, email and a whole host of other ways, and I will happily expand this tutorial if there's enough demand for more.
On naming git tags

One has to mangle the version in the debian/watch file anyway, when downloading from a host that is not GitHub; therefore his arguments for a raw, version-only tag are bogus, unless your upstream is using GitHub.

But I do not want to criticise only; rather, I want to provide a reason why a prefix is used, and why I chose the "worst" prefix: the project name itself.

One very strong reason to use a prefix, be it v or any other, is to ease tab completion. If you have a tag that is a bare number, you type the first digit, press tab, and get a mixture of commits and tags. You can't easily tell your completion system that you want commits or tags. With a prefix outside of the hex range, which v is, you can do that, and that makes working with tags a lot easier.

Is it for convenience? Yes. But ask yourself this: how many times do you have to write a debian/watch file? Once per package. How much time do you spend looking for a tag, or a commit? A lot more. So which one is more important? The convenience of someone who has to work around the tarball naming of the hosting service once, or that of the developer who works with the software daily? Obviously the latter!

But that is not all. I'm an advocate of prefixing the tag with the project name, actually; I've been doing that for all new projects for a while now, and I'm not going to change it, because it serves a very practical purpose: if you have a parent repository with a lot of submodules, and git tag tells you the project name too, that makes it much easier to navigate. I don't have to bake submodule support into my shell prompt (that would be costly), and I won't find it surprising when, having entered a subdirectory, git tag fails to list the version I know I'm working on, just because I happened to end up in a submodule. That is something I do often, as I have many repositories that have submodules, which in turn have other submodules, and so on and so forth through many layers.

A raw version number as a tag name is insufficient for my needs, simple as that. And I, as upstream, don't care that whoever packages my software will have to add a line to a watch file. That only needs to be done once, while I work with tags and commits daily.

I'm sorry, but calling this naming convention a disease just because GitHub's tarball naming is what it is, without even considering that there may be other reasons behind it than legacy, and that there are hosting sites outside of GitHub, is a mistake. Dear package maintainers, please do not annoy us upstreams with requests to cripple our daily work. Thank you.
dput-ng (1.4) unstable; urgency=low
[ Arno Töll ]
* Really fix #696659 by making sure the command line tool uses the most recent
version of the library.
* Mark several fields to be required in profiles (incoming, method)
* Fix broken tests.
* Do not run the check-debs hook in our mentors.d.n profile
* Fix "[dcut] dm bombed out" by using the profile key only when defined
(Closes: #698232)
* Parse the gecos field to obtain the user name / email address from the local
system when DEBFULLNAME and DEBEMAIL are not set.
* Fix "dcut reschedule sends "None-day" to ftp-master if the delay is
not specified" by forcing the corresponding parameter (Closes: #698719)
.
[ Luca Falavigna ]
* Implement default_keyid option. This is particularly useful with multiple
GPG keys, so dcut is aware of which one to use.
* Make scp uploader aware of "port" configuration option.
.
[ Paul Tagliamonte ]
* Hack around Launchpad's SFTP implementation. We mustn't stat *anything*.
"Be vewy vewy quiet, I'm hunting wabbits" (Closes: #696558).
* Rewrote the test suite to actually test the majority of the codepaths we
take during an upload. Back up to 60%.
* Added a README for the twitter hook, Thanks to Sandro Tosi for the bug,
and Gergely Nagy for poking me about it. (Closes: #697768).
* Added a doc for helping folks install hooks into dput-ng (Closes: #697862).
* Properly remove DEFAULT from loadable config blocks. (Closes: #698157).
* Allow upload of more than one file. Thanks to Iain Lane for the
suggestion. (Closes: #698855).
.
[ Bernhard R. Link ]
* allow empty incoming dir to upload directly to the home directory
.
[ Sandro Tosi ]
* Install example hooks (Closes: #697767).
Thanks to all the contributors!
For anyone who doesn't know, you should check out the docs.